SKEW DETECTION
The present invention relates to the field of image processing and, more 
particularly, to the detection or estimation of skew in document images. 

The automatic processing of document images, typically by computers, is 
now widespread and is performed for a variety of reasons including, for example, 
optical character recognition. Often there are problems in the automatic 
processing because the document image is skewed, and it is advisable to detect or 
estimate the skew angle, and correct for skew, before applying the further image 
processing. Incidentally, in the present document the expressions "skew 
detection" and "skew estimation" are both used to designate the process of 
determining a value for skew angle: the term "estimation" does not denote a lower 
level of accuracy in determining such a value. 

Various techniques have been proposed for automatic skew detection in 
document images. These are usually methods based on clustering of nearest 
neighbours, methods based on Hough transforms, or methods involving 
determination of projection profiles. However, these methods suffer from a 
number of drawbacks. Often the skew estimation/detection process is slow. Also, 
few methods are applicable to grey-scale images or to images containing 
drawings. Moreover, most known methods can give inaccurate results when 
applied to analysis of documents with text in non-Western scripts (for example in
Devnagari and Bangla scripts). 

It has been proposed to use techniques derived from mathematical 
morphology in an algorithm for skew detection in a document image, see for 
example, the paper "A fast algorithm for skew detection of document images 
using morphology" by A.K. Das and B. Chanda from IJDAR, International 
Journal on Document Analysis and Recognition, (2001) 4, pages 109-114. 
According to this proposal, the morphological operations of "closing" and 
"opening" (or "dilation" and "erosion") are applied to a document image in order 
to convert text lines into black bands. Subsequently, the black bands are analysed 
in order to find the baseline pixels of each text line, lines of a certain length are 
extracted and the orientation angles thereof are computed. The median angle is 
taken to represent the skew angle. 

Although the algorithm proposed by Das and Chanda is fast, and is 
applicable to a wide variety of script forms, it is not well-suited to processing 
documents containing drawings as well as text. Special steps must be included in
the Das and Chanda algorithm in an attempt to minimise the effect of drawings on
the skew-angle-estimation process. 

The present invention seeks to provide a new technique for skew
estimation based on mathematical morphology.

The principles of mathematical morphology were laid down in the 1960s
by G. Matheron and J. Serra. When applied to image analysis, mathematical
morphology provides a framework for analysing the shape and form of structures
present in the image. Many mathematical morphological operations make use of a
probe, or "structuring element", to investigate the structure of the image under
analysis. The shape and size of the structuring element must be adapted to the
geometric properties of the image objects to be processed. For example, linear
structuring elements are suited to the extraction of linear objects in an image.

Set notation is often used to express mathematical morphological
operations. The structuring element is often denoted by the set of points, B,
which constitutes it. When the structuring element is translated onto a point x,
then it is written $B_x$. For a black-and-white image, the set of all white pixels in
the image describes the image (the same is true for the set of all black pixels in the
image). Such a set can be considered to be an image object, F. A corresponding
image object, f, can be defined for a grey-scale image. There is no formal
difference between morphological operations whether applied to binary or
grey-scale images.

For mathematical morphology on grey-scale images different equivalent
approaches can be taken. A simple idea is to look at the "umbra" of the function,
that is the set $\{(y, x) \mid y < f(x)\}$, and to apply the usual set operators on this set.
Generally, for grey-scale images, planar structuring elements are used (for
instance a disk would be used in place of a sphere). Thus, the function is
considered level set by level set.

Another approach is to define morphological operators using a generalised
expression which applies to grey-scale images. For example, the expression for a
dilation operation would become:

$$(f \oplus B)(x) = \sup_{y \in B} f(x + y) \tag{1}$$

and a binary image would then correspond to the special case where $f(x) = 1$ if
$x \in X$ and $f(x) = 0$ elsewhere.

In the following description, when a binary image is involved the symbol
F will be used to designate the image object, when a grey-scale image is involved
the symbol f will be used, and when the image object can be either grey-scale or
binary the symbol A will be used.

It may be helpful to recall some of the basic operations used in
mathematical morphology, notably the operations of dilation, erosion, opening
and closing.
Dilation 

The operation of "dilation" seeks to answer the question "When a
structuring element, B, is translated onto a point x does it intersect with the set
defining the image object A?" The dilation of an image object A using a
structuring element B can be written $\delta_B(A)$. An image object can be repeatedly
dilated. If dilation is repeated n times then it is said that a dilation of size n has
been performed, and the result is written as $\delta_B^n(A)$.

In set notation, the dilation of an image can be expressed in terms of
Minkowski addition which, for a binary image F, gives:

$$\delta_B(F) = F \oplus B = \{x \mid B_x \cap F \neq \emptyset\} \tag{2}$$

In other words, the dilated image $\delta_B(F)$ will contain image points (typically,
black pixels) at all points x for which there is an intersection between the original
image F and the structuring element when translated onto x ($B_x$).

For a grey-scale image, f, the dilation of the image by the structuring
element B can be expressed, in a similar way, as:

$$[\delta_B(f)](x) = (f \oplus B)(x) = \max_{b \in B} f(x + b) \tag{3}$$

In other words, for a point x, the value of this point in the dilated image will be
the maximum of the values taken at the points (x+b) in the original grey-scale
image f, b representing the vectors defining the points in the structuring element
B.

Considered visually, dilation can be likened to adding a layer to objects 
represented in the image. A dilation of size n adds n layers to the objects. 
Erosion 

Erosion is the complement to dilation. The operation of "erosion" seeks to
answer the question "When a structuring element, B, is translated onto a point x is
the structuring element completely contained in the set defining the image object
A?" The erosion of an image object A using a structuring element B can be
written $\varepsilon_B(A)$. An image object can be repeatedly eroded and $\varepsilon_B^n(A)$ denotes an
image A that has been eroded n times.

In set notation, the erosion of an image can be expressed in terms of
Minkowski subtraction which, for a binary image F, gives:

$$\varepsilon_B(F) = F \ominus B = \{x \mid B_x \subseteq F\} \tag{4}$$

In other words, the eroded image $\varepsilon_B(F)$ will contain image points at all points x
for which, when the structuring element is translated onto x, it is completely
contained within the original image object.

For a grey-scale image, f, the erosion of the image by the structuring
element B can be expressed, in a similar way, as:

$$[\varepsilon_B(f)](x) = (f \ominus B)(x) = \min_{b \in B} f(x + b) \tag{5}$$

In other words, for a point x, the value of this point in the eroded image will be
the minimum of the values taken at the points (x+b) in the original grey-scale
image f, b representing the vectors defining the points in the structuring element
B.

Considered visually, erosion can be likened to stripping off a layer from 
objects represented in the image. 
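
Expressions (3) and (5) can be illustrated with a minimal sketch, assuming a one-dimensional grey-scale signal stored as a Python list and a structuring element given as a list of integer offsets. The names `grey_dilate` and `grey_erode` are hypothetical, and points of the structuring element that fall outside the signal are simply skipped, which is one of several possible border conventions.

```python
def grey_dilate(f, offsets):
    """Expression (3): each output value is the maximum of f(x + b), b in B."""
    n = len(f)
    return [max(f[x + b] for b in offsets if 0 <= x + b < n) for x in range(n)]


def grey_erode(f, offsets):
    """Expression (5): each output value is the minimum of f(x + b), b in B."""
    n = len(f)
    return [min(f[x + b] for b in offsets if 0 <= x + b < n) for x in range(n)]


f = [0, 0, 5, 0, 0, 3, 3, 3, 0]   # a narrow peak and a three-sample plateau
B = [-1, 0, 1]                    # symmetric three-point structuring element

print(grey_dilate(f, B))  # [0, 5, 5, 5, 3, 3, 3, 3, 3]: a layer is added
print(grey_erode(f, B))   # [0, 0, 0, 0, 0, 0, 3, 0, 0]: a layer is stripped
```

The narrow peak disappears under erosion because the structuring element cannot be contained within it, whereas the centre of the plateau survives.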
Opening

The opening operation consists of an erosion followed by a dilation (this is
not equivalent to a dilation followed by an erosion - see "Closing" below). If an
image A is opened by a structuring element B, then the result $\gamma_B(A)$ can be
expressed in a variety of ways:

$$\gamma_B(A) = A \circ B = A_B = (A \ominus B) \oplus B \tag{6}$$

(the first three expressions are just different symbolic representations of "A opened
by B"; the final expression indicates an erosion followed by a dilation).

Application of the opening operator to an image tends to smooth the
contours of objects in the image, to separate an "isthmus" in the image from the
"mainland" (if the link between the two is smaller than the structuring element),
and to remove objects (or their parts) which are smaller than the structuring
element.
Closing 

The closing operation consists of a dilation followed by an erosion. The
closing operation is the dual operation (not the inverse) of the opening operation.
If an image A is closed by a structuring element B, then the result $\varphi_B(A)$ can be
expressed in a variety of ways:

$$\varphi_B(A) = A \bullet B = A^B = (A \oplus B) \ominus B \tag{7}$$

Application of the closing operator to an image tends to close holes or slits
in the image if they are smaller than the structuring element and to cause the union
of "islands" to the "mainland" when the distance between them is shorter than the
structuring element.
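
The behaviour of opening and closing described above can be checked on a small example. The following is a minimal sketch, assuming a one-dimensional binary signal (0 = background, 1 = object) and a symmetric three-point structuring element; the function names are hypothetical and out-of-range points are skipped at the borders.

```python
def dilate(f, offsets):
    n = len(f)
    return [max(f[x + b] for b in offsets if 0 <= x + b < n) for x in range(n)]


def erode(f, offsets):
    n = len(f)
    return [min(f[x + b] for b in offsets if 0 <= x + b < n) for x in range(n)]


def opening(f, offsets):
    """Erosion followed by dilation, as in expression (6)."""
    return dilate(erode(f, offsets), offsets)


def closing(f, offsets):
    """Dilation followed by erosion, as in expression (7)."""
    return erode(dilate(f, offsets), offsets)


# One isolated pixel, then two three-pixel runs separated by a one-pixel slit.
f = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0]
B = [-1, 0, 1]

print(opening(f, B))  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0]
print(closing(f, B))  # [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
```

Opening removes the isolated pixel, which is smaller than the structuring element, while closing fills the one-pixel slit but leaves the wider three-pixel gap untouched.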

The preferred embodiments of the present invention make use of operators
from mathematical morphology in order to estimate skew in a document image in 
a new way. 

The preferred embodiments of the present invention provide a skew
estimation method which is robust, fast, is applicable to document images
containing text in a variety of scripts, is applicable to grey-scale as well as
black-and-white images, and which is not unduly affected by the presence of drawings.

More particularly, the present invention provides a method of estimating
skew in a document image, the method comprising the steps of:
run-length-smoothing the document image; and determining the erosion of the
run-length-smoothed image by a linear structuring element oriented at each of a
plurality of different angles whereby to determine the angle at which the surface
area of the eroded image is maximum, said angle being designated as the skew
angle of the document image.

In view of the fact that the erosion of an image by a structuring element
results in the set of points where the structuring element can be translated and still
be contained within the pre-erosion image, it can be understood intuitively that the
eroded image will have a maximum surface area when the structuring element is a
linear element aligned with the predominant direction of lines within the
pre-erosion image. Thus, the predominant angle of lines within an image can be
determined by varying the orientation of a linear structuring element used to erode
the image, and detecting the angle at which the eroded image has a maximum
surface area. In a skewed document image containing text, this predominant
angle tends to be the angle of skew.

The skew estimation method of the present invention works well for both
binary (typically black-and-white) images and for grey-scale images. Moreover,
the method according to the present invention provides one of the fastest
skew-estimation algorithms known to date.

Preferably, the document image is run-length-smoothed by closing the
document image using a linear structuring element. In the field of mathematical
morphology the expression "run-length-smoothing" would generally be
understood to refer to smoothing using a structuring element oriented at an angle
of 0°. However, in the present document "run-length-smoothing" is not limited
by reference to any specific orientation of the structuring element.

Advantageously, a plurality of different run-length-smoothed images are
produced by closing the document image using a linear structuring element
oriented at respective different angles. In this case, the step of eroding the
run-length-smoothed image comprises eroding each of the different
run-length-smoothed images using a linear structuring element oriented at the same
angle as the linear structuring element that was used when producing that
run-length-smoothed image.

It is to be understood that in the present document the expression "linear
structuring element" is not limited to a line-shaped segment. For example, the
linear structuring element used to erode the run-length smoothed image(s) can
consist of a pair of points having a particular angular relationship. In such a case,
the determination of how the surface area of the eroded image varies with varying
angular orientation of the linear structuring element approximates to a
determination of the rose of directions for the image, or the covariance of the
image. The "rose of directions" function, $p(\alpha)$, can be considered to be a function
indicating the probability that lines in the image are oriented at a particular angle,
$\alpha$.
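
The case of a structuring element made of a pair of points can be sketched as follows. This is an illustrative example only: it assumes a binary image stored as a set of (row, column) foreground pixels, rows growing downwards, and a hypothetical function name `pair_erosion_area`. The erosion keeps every pixel whose partner, a given distance away in the chosen direction, is also in the image, so its area measures how often pixels line up in that direction.

```python
import math


def pair_erosion_area(points, angle_deg, distance):
    """Surface area of the erosion of `points` by a two-point element.

    The element consists of the origin and a second point located
    `distance` pixels away in direction `angle_deg`.
    """
    rad = math.radians(angle_deg)
    dy = round(distance * math.sin(rad))
    dx = round(distance * math.cos(rad))
    return sum(1 for (y, x) in points if (y + dy, x + dx) in points)


# A horizontal line of ten pixels.
horizontal_line = {(0, x) for x in range(10)}

print(pair_erosion_area(horizontal_line, 0, 4))   # 6: pairs along the line
print(pair_erosion_area(horizontal_line, 90, 4))  # 0: no pairs across it
```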

Rather than calculate the surface area of the eroded run-length-smoothed 
image for all possible angles of the structuring element, the search for the angle 
corresponding to maximum surface area in the eroded image can be speeded up by 
using a one-dimensional optimisation algorithm. Preferably the image is sub- 
sampled before applying such an algorithm. 

A large number of calculations are involved in performing the various 
dilation and erosion operations in the skew estimation method of the present 
invention. In order to reduce the computational burden, a recursive algorithm can 
be used to perform these operations, when a grey-scale image is being processed. 
These operations can also be performed for binary images using currently- 
available devices implementing Fast Fourier Transforms. 

When the skew estimation method of the present invention is applied to a
binary document image, computation can be speeded up by performing a
logarithmic decomposition of the structuring element, and employing parallel
processing to perform the dilation and erosion operations. More particularly, w
pixels of the document image can be allocated to a w-bit data word, and a logical
operator can be simultaneously applied to the w pixels using a bitwise operator.
In such a case the speed of the skew estimation method can be evaluated
according to the following expression:

$$O\big((\log(k_1) + \log(k_1)\log(k_2))\, nm / w\big) \tag{8}$$

where $k_1$ is indicative of the length of the structuring element used in the
run-length-smoothing step, $k_2$ is indicative of the length of the structuring element
used in the eroding step, and nm is the number of pixels in the document image.

The present invention also provides apparatus adapted to put into practice
the above-described method. This apparatus can comprise a general-purpose
computer programmed to implement the method according to the invention.

The present invention yet further provides a computer program product
having a set of instructions to cause, when in use on a general-purpose computer,
said computer to perform the steps of the skew-estimation method according to
the present invention.

The above and other features and advantages of the present invention will
become clear from a reading of the following description of preferred
embodiments thereof, given by way of example, taken in conjunction with the
accompanying drawings, in which:

Figure 1 illustrates the effect of run-length-smoothing then erosion on a
skewed document image; and

Figure 2 shows how surface area of an eroded run-length-smoothed
document image varies with the angle of the structuring element used in the
erosion.

The following description of the skew-estimation method of the present
invention will be given in terms of a preferred embodiment in which the document
image being processed contains only text. However, it is to be understood that the
method is applicable to document images which contain drawings as well as text.

The presently-preferred embodiment of the skew-estimation method according
to the present invention has two main steps:

1. a run-length-smoothing algorithm is applied to the document image;
and

2. the probability that lines in the run-length-smoothed image are at a
given angle is investigated, for different angles, by determining the
surface area of the run-length-smoothed image when eroded using a
linear structuring element oriented at these different angles.

The method of the present invention can also be extended so as to include
not only skew estimation but also skew correction.
Run-length Smoothing 
In the run-length-smoothing step of the skew-estimation method according
to the present invention, the document image A can be run-length smoothed by
closing the image A using a linear structuring element. Advantageously, a
structuring element $k_1 L_0$ is used, which is a horizontal linear segment ($L_0$ is a
horizontal linear segment of length unity, $k_1$ is a scaling parameter). It is believed
that the value of the scaling parameter $k_1$ is not critical. For text documents, $k_1$ is
preferably approximately the same size as a typical word in the text. In an
appropriate case, this size could be evaluated from the dpi of the scanner
generating the document image. Alternatively, it can be computed, for instance
by computing the size of bounding boxes for all the connected components (i.e.
the letters) present in the black-and-white image. However, a suitable level of
accuracy in the skew estimation can be obtained, and the overall method can be
rendered faster, by setting a predetermined value for $k_1$.

The image resulting from applying a run-length-smoothing algorithm
consisting of closing image A using the structuring element $k_1 L_0$ can be denoted
by $\mathrm{RLSA}_0(A)$, and:

$$\mathrm{RLSA}_0(A) = (A \oplus k_1 L_0) \ominus k_1 L_0 \tag{9}$$

Application of this run-length-smoothing algorithm tends to blur the words in a
text line into blobs which merge together into a black band - this process being
most successful in merging the words on a text line in a document where there is
no skew.

However, a run-length-smoothed image can also be obtained by closing
the document image A using a linear structuring element $k_1 L_\alpha$ oriented at any
chosen angle $\alpha$. In other words, we can calculate:

$$\mathrm{RLSA}_\alpha(A) = (A \oplus k_1 L_\alpha) \ominus k_1 L_\alpha \tag{10}$$

This process will be most successful at merging words in a text line into a band in
the case where the angle $\alpha$ of the structuring element is the same as the document
skew angle. Thus, according to the presently-preferred embodiment of the present
invention, the run-length-smoothing step is performed to calculate $\mathrm{RLSA}_\alpha(A)$ for
a plurality of different values of $\alpha$. Usually document skew angle is within a fairly
small range of angles (typically ±15°), so it is often sufficient to calculate
$\mathrm{RLSA}_\alpha(A)$ values for $\alpha$ in the range ±15°. Alternatively, to give a margin for
error, it can be useful to calculate $\mathrm{RLSA}_\alpha(A)$ values for $\alpha$ in a range somewhat
broader than the expected range of skew angle (for example, ±17° or ±20°).
Calculating $\mathrm{RLSA}_\alpha(A)$ values for too broad a range of $\alpha$ values would
disadvantageously increase the time required for computation.

It could be envisaged to apply a dilation, rather than a closing operation, to
the document image during this stage of the method according to the invention.
However, this is less desirable because it results in a less accurate skew angle
estimate and is slower to implement.
Investigating Line Orientation 

When an image A is eroded using a linear structuring element $k_2 L_\alpha$
oriented at an angle $\alpha$, the result has a maximum surface area when the orientation
angle $\alpha$ of the structuring element matches the predominant angle of lines in the
image A. Thus, a function $p(\alpha)$ can be defined, as follows:

$$p(\alpha) = \text{surface area of } (A \ominus k_2 L_\alpha) \tag{11}$$

where $k_2$ is a scaling factor, and this function $p(\alpha)$ will have a maximum value at
an angle $\alpha$ corresponding to the predominant angle of lines in the image A. As for
the scaling parameter $k_1$, the value of the scaling factor $k_2$ is not critical.
However, it should be sufficiently larger than $k_1$. A suitable value is, for example,
of the order of 10 times the size of a typical word in a text document.

Thus, preferred embodiments of the present invention determine skew
angle in a document image by determining the angle at which there is a maximum
in the function $p(\alpha)$ calculated for the run-length-smoothed document image.
This angle should correspond to the predominant angle of lines in the document
image.

We could calculate $p(\alpha)$ = surface area of $(\mathrm{RLSA}_0(A) \ominus k_2 L_\alpha)$, and look
for the maximum of this function. However, this would only give an accurate
skew angle estimate for small skew angles, and it would be relatively slow to
compute. The presently-preferred embodiment of the invention calculates:

$$p'(\alpha) = \text{surface area of } (\mathrm{RLSA}_\alpha(A) \ominus k_2 L_\alpha) \tag{12}$$

that is,

$$p'(\alpha) = \text{surface area of } \{[(A \oplus k_1 L_\alpha) \ominus k_1 L_\alpha] \ominus k_2 L_\alpha\} \tag{13}$$

In other words, to determine the function $p'(\alpha)$ a plurality of run-length-smoothed
images, each generated using a linear structuring element at a respective angle $\alpha_i$,
are each eroded using a respective linear structuring element oriented at the
corresponding angle $\alpha_i$. The angle at which $p'(\alpha)$ has a maximum is the
estimated skew angle.
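
The computation of expression (13) over a set of candidate angles can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the tests reported below: the binary image is stored as a set of (row, column) foreground pixels, rows grow downwards, the linear structuring element is a symmetric discrete segment given by a half-length, and all function names and the synthetic test page are hypothetical.

```python
import math


def line_offsets(angle_deg, half_len):
    """Symmetric discrete segment of 2 * half_len + 1 pixels as (dy, dx) offsets."""
    slope = math.tan(math.radians(angle_deg))
    return [(round(t * slope), t) for t in range(-half_len, half_len + 1)]


def dilate(points, offsets):
    """Expression (2): pixels whose translated element meets the image."""
    return {(y - dy, x - dx) for (y, x) in points for (dy, dx) in offsets}


def erode(points, offsets):
    """Expression (4): pixels whose translated element lies inside the image."""
    dy0, dx0 = offsets[0]
    candidates = {(y - dy0, x - dx0) for (y, x) in points}
    return {(y, x) for (y, x) in candidates
            if all((y + dy, x + dx) in points for (dy, dx) in offsets)}


def rho_prime(points, angle_deg, h1, h2):
    """Expression (13): area of the closed image after erosion at the same angle."""
    b1 = line_offsets(angle_deg, h1)
    b2 = line_offsets(angle_deg, h2)
    smoothed = erode(dilate(points, b1), b1)   # closing, i.e. RLSA at this angle
    return len(erode(smoothed, b2))


def estimate_skew(points, angles, h1, h2):
    """Return the candidate angle that maximises rho_prime."""
    return max(angles, key=lambda a: rho_prime(points, a, h1, h2))


def synthetic_page(skew_deg, width=120, band_tops=(5, 35), thickness=5):
    """Two skewed 'text lines' made of eight-pixel words and two-pixel gaps."""
    slope = math.tan(math.radians(skew_deg))
    page = set()
    for x in range(width):
        if x % 10 >= 8:                        # blank columns between words
            continue
        for top in band_tops:
            for j in range(thickness):
                page.add((top + round(x * slope) + j, x))
    return page


page = synthetic_page(10)
print(estimate_skew(page, [-10, -5, 0, 5, 10], h1=2, h2=35))  # 10
```

On this synthetic page the long eroding segment fits inside the smoothed bands only when it is oriented at the true skew angle, so the eroded area is zero for the other candidates. A real document would call for a finer angular step and for values of the two lengths chosen as discussed above.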

The above expression (13) for $p'(\alpha)$ requires computation of the surface
area of an entity $\{[(A \oplus k_1 L_\alpha) \ominus k_1 L_\alpha] \ominus k_2 L_\alpha\}$ resulting from performance of a
closing operation $(A \oplus k_1 L_\alpha) \ominus k_1 L_\alpha$ followed by an erosion $\ominus\, k_2 L_\alpha$. However,
because of the associative nature of morphological operators, this entity is also
equal to the result of performing a dilation $A \oplus k_1 L_\alpha$ followed by an erosion
$\ominus\, (k_1 + k_2) L_\alpha$. This latter process is quicker to compute. Accordingly, preferred
embodiments of the present invention compute the following expression:

$$p'(\alpha) = \text{surface area of } [A \oplus k_1 L_\alpha] \ominus [(k_1 + k_2) L_\alpha] \tag{14}$$
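
The step from expression (13) to expression (14) can be written out explicitly. The short derivation below is a sketch that assumes two standard properties: successive erosions compose as $(X \ominus B) \ominus C = X \ominus (B \oplus C)$, and two collinear segments add under Minkowski addition. The second property is exact for continuous segments and holds only approximately for segments sampled on a pixel grid.

```latex
\begin{align*}
p'(\alpha)
  &= \text{area}\bigl\{[(A \oplus k_1 L_\alpha) \ominus k_1 L_\alpha] \ominus k_2 L_\alpha\bigr\} \\
  &= \text{area}\bigl\{(A \oplus k_1 L_\alpha) \ominus (k_1 L_\alpha \oplus k_2 L_\alpha)\bigr\} \\
  &= \text{area}\bigl\{(A \oplus k_1 L_\alpha) \ominus (k_1 + k_2) L_\alpha\bigr\}
\end{align*}
```

One dilation and one erosion are thus performed per angle instead of one dilation and two erosions.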

Test Results 

Fig. 1 shows a document image A (Fig. 1a)) and illustrates the result of
run-length-smoothing then eroding this image using structuring elements oriented at
different angles. The document image of Fig. 1a) has a skew angle of -3°.

More particularly, Fig. 1b) illustrates the result of run-length smoothing the
image A of Fig. 1a) by closing that image using a linear structuring element
oriented at 0°, then eroding this run-length-smoothed image $\mathrm{RLSA}_0(A)$ using a
linear structuring element $k_2 L_0$ oriented at 0°. Fig. 1c) illustrates the result of
run-length smoothing the image of Fig. 1a) by closing that image using a linear
structuring element oriented at +1°, then eroding this run-length-smoothed image
$\mathrm{RLSA}_1(A)$ using a linear structuring element $k_2 L_1$ oriented at +1°. Fig. 1d)
illustrates the result of run-length smoothing the image of Fig. 1a) by closing that
image using a linear structuring element oriented at -3°, then eroding this
run-length-smoothed image $\mathrm{RLSA}_{-3}(A)$ using a linear structuring element $k_2 L_{-3}$
oriented at -3°.

It will be seen from Fig. 1 that, as the angle of the structuring element
approaches the correct skew angle, the run-length-smoothed and eroded image has
darker, thicker bands. Indeed, the processed image having the darkest, thickest
bands is shown in Fig. 1d), which corresponds to the original document image
run-length smoothed and eroded using linear structuring elements oriented at the skew
angle. This image will have the greatest surface area, as is illustrated by Fig. 2.

Fig. 2 is a graph showing how the surface area of the run-length-smoothed
and eroded images of Fig. 1 varies with the angle $\alpha$. It will be seen that the function
$p'(\alpha)$ has a maximum at the angle $\alpha$ = -3°. Thus, the method of the
presently-preferred embodiment of the present invention yields a skew angle
estimate of -3°.

It will be seen from Figs. 1 and 2 that the skew-estimation method of the
present invention is effective to determine the skew angle of a document image.

Moreover, tests have been performed using the method according to the
present invention, with calculations being implemented by a Pentium III®, 733
MHz computer estimating skew in a document image measuring 1214x1151
pixels. Even though the program had not been specifically optimised, an accurate
skew estimate was produced in less than 0.75 seconds. If the program had been
optimised using known programming techniques, as is preferred according to the
present invention, then the calculation time would have been further reduced.
Thus, it is apparent that the skew-estimation method of the present invention is
amongst the very fastest known.
Computation of the Skew Angle Estimate 

When implementing the skew angle estimation method of the present 
invention there are numerous simplifications and approximations that can be made 
in order to speed up computation. 

It should first be noted that although the invention has been presented in 
terms of a two-step process, in practice the two steps can be integrated. In other 
words, the invention is not limited to the case where all run-length smoothing 
operations are performed first and then all erosion operations are performed 
subsequently. Notably, as mentioned above, by taking advantage of the 
associative nature of morphological operations the method can be speeded up by 
calculating the expression (14). 

Further, when determining the function $p'(\alpha)$ (or $p(\alpha)$) for a particular
document image, rather than calculate the value of this function for a large
number of individual values of $\alpha$, a one-dimensional optimisation algorithm can
be used in order to reduce the number of individual values of $p'(\alpha)$ (or $p(\alpha)$) that
need to be computed. A suitable level of accuracy in the skew angle estimate can
be obtained using Brent's method described in "Numerical Recipes" by W.H.
Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, published by
Cambridge University Press, 1989, pp.283-6.

Brent's method is a kind of parabolic interpolation in which the values of
six parameters a, b, u, v, w and x, are monitored. The parameters a and b are the
limits of a bounding interval in which the minimum is located, x is the point with
the lowest function value found so far, w is the point with the second lowest
function value found so far, u is the point at which the function was evaluated
most recently, and v is the previous value of w. The method is iterative. Since the
method as described locates a minimum, the maximum of $p'(\alpha)$ can be found by
applying it to $-p'(\alpha)$.

According to Brent's method, parabolic interpolation is attempted fitting
through the points x, v and w. In order to be considered acceptable, the current
parabolic-interpolation step must (i) produce a new minimum which falls within
the bounding interval (a, b), and (ii) imply a movement (amount of change) from
the best current value, x, that is less than half the movement of the step before last.
This second criterion ensures that the successive steps of the method will lead to
convergence. In the worst case, where successive steps approximately alternate
between parabolic steps and golden sections, there will ultimately be convergence
thanks to the golden sections.
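
The idea of a one-dimensional search can be illustrated with a simpler technique than Brent's method. The sketch below uses plain golden-section search, that is, only the golden-section steps mentioned above and no parabolic interpolation, and it searches directly for a maximum. It assumes that the function has a single peak within the search interval; the function name `golden_section_max` and the stand-in objective are hypothetical.

```python
import math


def golden_section_max(func, lo, hi, tol=1e-3):
    """Locate the maximum of a single-peaked function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2          # about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc > fd:                           # the peak lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = func(c)
        else:                                 # the peak lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = func(d)
    return (lo + hi) / 2


# Stand-in for the eroded-area function, peaking at a skew angle of -3 degrees.
def area_for_angle(angle):
    return -(angle + 3.0) ** 2


best = golden_section_max(area_for_angle, -15.0, 15.0)
print(round(best, 2))
```

Each iteration requires only one new evaluation of the function, which in the skew-estimation setting means one dilation and one erosion of the image. On a pixel grid the area function is piecewise constant, so the single-peak assumption holds only approximately.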

Preferably, before applying the above-described algorithm according to
Brent, the document image is sub-sampled so as to reduce the required 
computation time. It is to be noted that the sub-sampling operation can be 
performed simultaneously with the dilation operation. 

Moreover, it will be seen that a large number of dilation and erosion
operations need to be performed when implementing the skew-estimation method
of the present invention. For example, the raw algorithm for computing erosion
or dilation of a grey-scale image consists in calculating a minimum or maximum
value from amongst a number of pixels equal to the number of pixels in the
structuring element, for each pixel of the image. For a structuring element of n
pixels, there are thus n-1 min/max comparisons per image pixel. This number of
calculations can be drastically reduced, thus reducing the overall computation
time, by using appropriate algorithms and data structures. Similarly,
implementation of dilation and erosion operations in the method of the present
invention in general can be optimised by use of appropriate algorithms and data
structures. Some preferred techniques are discussed below.
For Skew Estimation in Binary Images 

Dilation and erosion operations can be performed using a Fourier
transform, as explained in "Mathematical morphology and convolution" by
J.E. Mazille published in the Journal of Microscopy, 156(1):3-13, October 1989,
and in "Morphological filtering using a Fourier Transform hologram" by M.
Killinger, J.L. de Bougrenet de la Tocnaye, P. Cambon and C. Le Moing,
published in Optics Communications, 73(6):434-438, November 1989. The
skew-estimation method of the present invention can thus be implemented in a rapid and
efficient manner by making use of currently-available Fast Fourier Transform
devices to perform the dilation and erosion operations required by the method
according to the invention, in the manner explained by Mazille and Killinger et al.

Moreover, the property of associativity of morphological operations
mentioned above can be used in conjunction with a logarithmic decomposition of
the (convex) structuring element. In particular, it is possible to decompose a
convex set using a logarithmic expression, based on a definition of extreme sets of
a convex set. The relevant definition of extreme sets is given in "Speeding up
successive Minkowski operations" by J. Pecht, in Pattern Recognition Letters,
3(2):113-117, 1985. In our case, a line-shaped structuring element can be
decomposed into a well-chosen sequence of points. When dealing with images
defined on a grid, a line-shaped segment of length 1 is reduced to a pair of points
close to each other on the grid. When dealing with longer line-shaped segments,
it is not obligatory to consider each point on the projection of the segment on the
grid.

Furthermore, dilation and/or erosion operations can be applied in parallel
to the various bits of the binary image. Since w pixels of a binary image can be
represented using a w-bit data type word, a logical operator implementing
dilation/erosion can be simultaneously applied to w pixels of the image using a
bitwise operator. In other words, on a machine using 32-bit data-words, 32 pixels
of the image can be processed in one machine cycle. This technique is described
in detail in the PhD thesis "Mathematical morphology: extension towards
computer vision" by R. van den Boomgaard, Amsterdam University, 1992, and in
the paper "Methods for fast morphological image transform using bitmapped
binary images" by R. van den Boomgaard and R. van Balen in Computer Vision,
Graphics and Image Processing: Graphical Models and Image Processing,
54(3):252-258, 1992.
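
The combination of bitwise parallelism and logarithmic decomposition can be sketched for the simplest case, a horizontal segment. The example below is an illustration only and is not taken from the cited works: one image row is packed into a Python integer with one bit per pixel, and the row is dilated by a segment of k pixels using about log2(k) shift-and-OR steps instead of k - 1. Python integers have arbitrary width, so a whole row is handled in one operation here, whereas a fixed word size w would apply in a language such as C. The function names are hypothetical.

```python
def dilate_row(row_bits, k, width):
    """Dilate a bit-packed row by a horizontal segment of k pixels."""
    mask = (1 << width) - 1
    cover, span = row_bits, 1
    while span * 2 <= k:
        cover |= cover << span        # doubles the length already covered
        span *= 2
    cover |= cover << (k - span)      # covers the remainder of the segment
    return cover & mask


def dilate_row_naive(row_bits, k, width):
    """Reference version using one shift-and-OR per pixel of the segment."""
    mask = (1 << width) - 1
    out = 0
    for shift in range(k):
        out |= row_bits << shift
    return out & mask


row = 0b0001000
print(bin(dilate_row(row, 3, 7)))     # 0b111000
```

Erosion can be obtained in the same way by replacing the OR operations with AND operations.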

When using an approach combining the logarithmic decomposition of the
structuring element with parallel processing of image pixels, the speed of the skew
estimation can be evaluated by computing the expression:

$$O\big((\log k_1 + \log k_1 \log k_2)\, nm/w\big) \qquad (8)$$

where $k_1$ and $k_2$ are the scaling parameters of the run-length-smoothing and
erosion operations, nm is the number of bits in the image (it is an image of
dimension n pixels by m pixels), and w is the number of bits in the data-word; a
hash table is then used to compute the surface area of the result.
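
One possible reading of the hash-table step is a lookup table that maps each byte value to its number of set bits, so that the surface area of a bit-packed image is obtained by table look-ups rather than by testing pixels individually. The following Python sketch illustrates that reading only; the names BITS_IN_BYTE and surface_area are hypothetical, and the rows are assumed to be packed into non-negative integers.

```python
# Lookup table (a Python dict, i.e. a hash table): byte value -> set bits.
BITS_IN_BYTE = {value: bin(value).count("1") for value in range(256)}


def surface_area(packed_rows):
    """Count the foreground pixels of an image whose rows are packed
    into non-negative integers, one byte-sized look-up at a time."""
    total = 0
    for word in packed_rows:
        while word:
            total += BITS_IN_BYTE[word & 0xFF]
            word >>= 8
    return total
```

For example, surface_area([0b1011, 0xFFFF]) counts 3 pixels in the first row and 16 in the second.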

For Skew Estimation in Grey-Scale Images: 

When calculating dilations and erosions of a grey-scale image using a
structuring element which is a line segment, the number of minimum/maximum
comparisons per image pixel can be reduced to 3, regardless of the length of the
line segment, using a recursive algorithm proposed by M. van Herk in "A fast
algorithm for local minimum and maximum filters on rectangular and orthogonal
kernels", published in Pattern Recognition Letters, 13:517-521, 1992. This
algorithm can be applied when calculating dilations and erosions involving a
linear structuring element oriented at any angle, as explained in "Recursive
implementation of erosions and dilations along discrete lines at arbitrary angles"
by P. Soille, E.J. Breen and R. Jones, published in IEEE Transactions on PAMI,
18(5):562-566, 1996. It is advantageous for the present invention to make use of
these recursive algorithms when performing dilations and erosions.
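
The recursive scheme referred to above can be illustrated, for the simple case of a horizontal segment applied to one image row, by the following Python sketch. It follows the usual block-wise formulation of the van Herk approach (a forward running maximum and a backward running maximum within blocks of k samples, combined with one further comparison); the function names running_max and running_min are hypothetical, and the extension to arbitrary angles described by Soille et al. is not shown.

```python
def running_max(values, k):
    """Maximum of every window values[i:i+k], using about three
    comparisons per sample whatever the window length k."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("window length must satisfy 1 <= k <= len(values)")
    neg_inf = float("-inf")
    padded = list(values) + [neg_inf] * (-n % k)  # pad to whole blocks
    m = len(padded)
    g = [neg_inf] * m  # running maximum from the start of each block
    h = [neg_inf] * m  # running maximum from the end of each block
    for i in range(m):
        g[i] = padded[i] if i % k == 0 else max(g[i - 1], padded[i])
    for i in range(m - 1, -1, -1):
        h[i] = padded[i] if i % k == k - 1 else max(h[i + 1], padded[i])
    # A window spans at most two blocks: h covers its left part and
    # g covers its right part.
    return [max(h[i], g[i + k - 1]) for i in range(n - k + 1)]


def running_min(values, k):
    """Grey-scale erosion along the row, obtained from running_max by negation."""
    return [-v for v in running_max([-x for x in values], k)]
```

Here running_max corresponds to a grey-scale dilation and running_min to a grey-scale erosion by a segment of k pixels, restricted to the positions where the segment lies wholly inside the row.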

It is also noted that a new algorithm for computing dilation/erosion at
arbitrary angles has recently been proposed in "Directional Morphological
Filtering" by P. Soille and H. Talbot in IEEE Transactions on Pattern Analysis
and Machine Intelligence, 2001, vol. 23, no. 11. This algorithm may be used in
implementing the method according to the present invention.

The Structuring Element

In the description above, it is stated that the run-length-smoothing step and
line-direction investigation step of the present invention make use of a linear
structuring element. It is to be understood that this can be a line segment, but that
it can also be other structures which have a main direction. For example, in the
line-direction investigation step, it is also possible to use a structuring element
$k_2 P_{1,v}$, where $P_{1,v}$ can be derived from the following expression:

$$P_{1,v} = \{(0,0),\ (\cos\alpha, \sin\alpha)\} \qquad (15)$$

It will be understood that this structuring element consists of a pair of points [(0,0)
and ($k_2\cos\alpha$, $k_2\sin\alpha$)] separated by fixed distance $k_2$ and having a relative
orientation that can be described using angle α. As a further example, in the
line-direction investigation step a structuring element corresponding to a rectangle can
be used, having the longest line borders thereof oriented at a given angle α (this
angle α then being varied, as described above). Other examples will readily occur
to the person skilled in this field.

Interestingly, the surface areas of erosions by a pair of points separated by a
fixed distance but with varying orientations are sometimes represented in a polar
diagram which is called a "rose of directions". This is the curve of (ρ(α), α) for α
taking values 0 to 360°. Thus, the line-direction investigation step of the present
invention is similar to determining the rose of directions (given by equation (11)
above) for the run-length-smoothed image.

Also, the covariance K of an image A is calculated by measuring the
volume (or the surface area) of the image A eroded by a pair of points $P_{1,v}$. More
particularly:

$$K(A; P_{1,v}) = \mathrm{Vol}(A \ominus P_{1,v}) \qquad (16)$$

For binary images, F, this expression reduces to:

$$K(F; P_{1,v}) = \mathrm{SurfaceArea}(F \cap F_v) \qquad (17)$$

which is the same as the rose of directions.
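
As a minimal sketch of the binary case, the following Python example counts the pixels that survive erosion by a pair of points separated by a distance k at a given angle, and tabulates that count over a set of angles. The representation of the image as a set of (x, y) foreground coordinates, the rounding of the second point to the nearest grid position, and the names covariance and rose_of_directions are all assumptions made for illustration.

```python
import math


def covariance(foreground, k, angle_deg):
    """Number of pixels p such that both p and p + (k cos a, k sin a),
    rounded to the grid, belong to the foreground set."""
    a = math.radians(angle_deg)
    dx, dy = round(k * math.cos(a)), round(k * math.sin(a))
    return sum(1 for (x, y) in foreground if (x + dx, y + dy) in foreground)


def rose_of_directions(foreground, k, angles_deg):
    """Covariance for each candidate angle, as a dict angle -> surface area."""
    return {angle: covariance(foreground, k, angle) for angle in angles_deg}


# Usage: a horizontal bar of 20 x 2 pixels responds most strongly at 0 degrees.
bar = {(x, y) for x in range(20) for y in range(2)}
rose = rose_of_directions(bar, 5, [0, 45, 90])
best_angle = max(rose, key=rose.get)
```

The angle giving the largest count plays the same role as the angle of maximum surface area in the line-direction investigation step.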

In view of the above, calculation techniques known for determining the
rose of directions and for determining the covariance of an image can be adapted
for use in the present invention.

Skew Correction

Once the skew angle of a document image has been estimated/detected, it
is a straightforward matter to correct for skew automatically, for example by
implementing a simple rotation algorithm. To calculate the correct value for a
pixel at a location (x,y) in the skew-corrected image, the original position
($x_{old}$, $y_{old}$) of the corresponding pixel in the skewed image is calculated using the
following equations:

$$x_{old} = x\cos\alpha + y\sin\alpha$$
$$y_{old} = y\cos\alpha - x\sin\alpha \qquad (18)$$

where α is the estimated skew angle of the document image. However, ($x_{old}$, $y_{old}$)
rarely corresponds to a pixel location in the skewed image, so it is usually
necessary to interpolate between the values of the surrounding pixels in the
skewed document, by taking a weighted average where the weights depend upon
the proximity of the respective surrounding pixels to the location ($x_{old}$, $y_{old}$).
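
The rotation of equations (18), followed by a weighted average of the four surrounding pixels, can be sketched in Python as follows. Bilinear weighting is used here as one common choice of proximity-based weights; the function name deskew, the representation of the image as a list of rows of grey values, the rotation about the pixel (0,0) and the background value assigned to positions falling outside the skewed image are assumptions of the example.

```python
import math


def deskew(image, alpha_deg, background=255.0):
    """Return the skew-corrected image for an estimated skew angle."""
    height, width = len(image), len(image[0])
    c = math.cos(math.radians(alpha_deg))
    s = math.sin(math.radians(alpha_deg))
    out = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Equations (18): position of this pixel in the skewed image.
            x_old = x * c + y * s
            y_old = y * c - x * s
            if 0 <= x_old <= width - 1 and 0 <= y_old <= height - 1:
                x0, y0 = int(x_old), int(y_old)
                x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
                fx, fy = x_old - x0, y_old - y0
                # Weights fall off linearly with distance to each neighbour.
                top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
                bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
                out[y][x] = (1 - fy) * top + fy * bottom
    return out
```

A practical implementation would typically rotate about the image centre rather than about the pixel (0,0), which only adds a translation to equations (18).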

As indicated above, the present invention also provides apparatus for 
implementing the above-described methods. Typically, this is a suitably- 
programmed general-purpose computer. However, it is also possible to use 
dedicated hardware to implement the method. 

Various modifications and developments can be made in the detailed
embodiments described herein without departing from the scope of the present
invention as described in the appended claims.

CLAIMS:

1. A method of estimating skew angle in a document image, the method 
comprising the steps of: 

run-length-smoothing the document image (A); and 

determining the erosion of the run-length-smoothed image (RLSA) by a
linear structuring element ($k_2 L_\alpha$) oriented at each of a plurality of different angles
(α) whereby to determine the angle at which the surface area of the eroded image
is maximum, said angle being designated as the skew angle of the document
image.

2. The skew estimation method of claim 1, wherein the step of run-length- 
smoothing the document image comprises closing the document image using a 
linear structuring element ($k_1 L$).

3. The skew estimation method of claim 2, wherein: 

the step of run-length-smoothing the document image (A) comprises
producing a plurality of different run-length-smoothed images ($RLSA_\alpha$), each of
said different run-length-smoothed images ($RLSA_\alpha$) being produced by closing
the document image (A) using a linear structuring element ($k_1 L_\alpha$) oriented at a
respective one ($\alpha_i$) of said plurality of different angles; and

the step of eroding the run-length-smoothed image comprises eroding each
of said plurality of different run-length-smoothed images ($RLSA_\alpha$) using a linear
structuring element ($k_2 L_\alpha$) oriented at the same angle ($\alpha_i$) as the linear structuring
element used in the closing operation producing the respective run-length-
smoothed image ($RLSA_\alpha$).

4. The skew estimation method of claim 1, 2 or 3, wherein the, or each, linear 
structuring element applied in the eroding step consists of a pair of points ($P_{1,v}$)
having a particular angular relationship. 

5. The skew estimation method of any previous claim, wherein the eroding 
step comprises determining the covariance (K) of the run-length-smoothed image. 

6. The skew estimation method of any previous claim, wherein the eroding
step comprises applying a one-dimensional optimisation algorithm whereby to
determine the angle at which the surface area of the eroded image is a maximum
whereby to reduce the number of angles at which the erosion of the run-length-
smoothed image is calculated.

7. The skew estimation method of claim 6, and comprising the step of sub-
sampling the document image before applying the one-dimensional optimisation
algorithm.

8. The skew estimation method of any previous claim, applied to a grey-scale
document image, wherein a recursive algorithm is used to perform dilation and
erosion operations in the run-length-smoothing and eroding steps.

9. The skew estimation method of any previous claim, applied to a binary
document image, wherein the linear structuring element is decomposed
logarithmically, and dilation and/or erosion operations are performed using
parallel processing of the pixels of the document image.

10. The skew estimation method of any previous claim, wherein Fast Fourier
Transforms are used to perform dilation and erosion operations in the run-length-
smoothing and eroding steps.

11. Skew angle estimation apparatus comprising:

means adapted to run-length-smooth a document image (A); and

means adapted to determine the erosion of the run-length-smoothed image
(RLSA) by a linear structuring element oriented at each of a plurality of different
angles whereby to determine the angle at which the surface area of the eroded
image is maximum, said angle being designated as the skew angle of the
document image.

12. The skew estimation apparatus of claim 11, wherein the run-length-
smoothing means is adapted to close the document image using a linear
structuring element.

13. The skew estimation apparatus of claim 12, wherein:

the run-length-smoothing means is adapted to produce a plurality of
different run-length-smoothed images ($RLSA_\alpha$), each of said different run-length-
smoothed images ($RLSA_\alpha$) being produced by closing the document image (A)
using a linear structuring element oriented at a respective one (α) of said
plurality of different angles; and

the eroding means is adapted to erode each of said plurality of different
run-length-smoothed images ($RLSA_\alpha$) using a linear structuring element oriented
at the same angle (α) as the linear structuring element used by the run-length-
smoothing means in producing the respective run-length-smoothed image
($RLSA_\alpha$).

14. The skew estimation apparatus of claim 11, 12 or 13, wherein the, or each,
linear structuring element applied by the eroding means consists of a pair of points
having a particular angular relationship.

15. The skew estimation apparatus of any one of claims 11 to 14, wherein the
eroding means comprises means adapted to determine the covariance (K) of the
run-length-smoothed image.

16. The skew estimation apparatus of any one of claims 11 to 15, wherein the
eroding means comprises means applying a one-dimensional optimisation
algorithm to determine the angle at which the surface area of the eroded image is a
maximum whereby to reduce the number of angles at which the erosion of the
run-length-smoothed image is calculated.

17. The skew estimation apparatus of claim 16, and comprising sub-sampling
means adapted to sub-sample the document image before the one-dimensional
optimisation algorithm is applied.

18. The skew estimation apparatus of any one of claims 11 to 17, wherein the
run-length-smoothing means and eroding means are adapted to use a recursive
algorithm to perform dilation and erosion operations when the document image is
a grey-scale image.

19. The skew estimation apparatus of any previous claim, and comprising
parallel processing means for allocating w pixels of the document image to a w-bit
data word and applying a dilation and/or erosion operation to the w-bit data word
using a bitwise operator.

20. The skew estimation apparatus of any one of claims 11 to 19, and
comprising Fast Fourier Transform Units to perform dilation and erosion
operations required by the run-length-smoothing means and eroding means.

21. Skew estimation apparatus according to any one of claims 11 to 20
implemented as a specially-programmed general-purpose computer.

22. A computer program product having a set of instructions to cause, when in
use on a general-purpose computer, said computer to perform the steps of the
skew-estimation method of any one of claims 1 to 10.

SKEW DETECTION

Skew angle in a document image (A) is estimated using operators known
from mathematical morphology. The document image is run-length smoothed and
a plurality of eroded run-length-smoothed images are then produced. The
run-length-smoothed image ($RLSA(A)$) is eroded using a linear structuring
element ($k_2 L_\alpha$) oriented at each of a plurality of different angles (α). The
angle of the linear structuring element which produces an eroded image having
the greatest surface area is designated as the skew angle. A plurality of
run-length-smoothed images ($RLSA_\alpha(A)$) may be produced, each generated by
smoothing the document image using a linear structuring element ($k_1 L_\alpha$)
oriented at a respective different angle ($\alpha_i$). Then each run-length-smoothed
image ($RLSA_\alpha(A)$) is eroded using a linear structuring element oriented at the
corresponding angle ($\alpha_i$).
(Fig. 1)

[Fig. 1 panels: (a) Original image; (b) $RLSA_0(A) \ominus k_2 L_0$; (c) $RLSA_1(A) \ominus k_2 L_1$]